ON THE COMMUTATION OF
GENERALIZED MEANS ON PROBABILITY SPACES 


PAOLO LEONETTI, JANUSZ MATKOWSKI, AND SALVATORE TRINGALI 

Abstract. Let $f$ and $g$ be real-valued continuous injections defined on a non-empty real
interval $I$, and let $(X, \mathscr{L}, \lambda)$ and $(Y, \mathscr{M}, \mu)$ be probability spaces in each of which there is at
least one measurable set whose measure is strictly between 0 and 1.

We say that $(f, g)$ is a $(\lambda, \mu)$-switch if, for every $\mathscr{L} \otimes \mathscr{M}$-measurable function $h : X \times Y \to \mathbb{R}$
for which $h[X \times Y]$ is contained in a compact subset of $I$, it holds

$$f^{-1}\left(\int_X f\left(g^{-1}\left(\int_Y g \circ h \, d\mu\right)\right) d\lambda\right) = g^{-1}\left(\int_Y g\left(f^{-1}\left(\int_X f \circ h \, d\lambda\right)\right) d\mu\right),$$

where $f^{-1}$ is the inverse of the corestriction of $f$ to $f[I]$, and similarly for $g^{-1}$.

We prove that this notion is well-defined, by establishing that the above functional equation
is well-posed (the equation can be interpreted as a permutation of generalized means
and raised as a problem in the theory of decision making under uncertainty), and show that
$(f, g)$ is a $(\lambda, \mu)$-switch if and only if $f = ag + b$ for some $a, b \in \mathbb{R}$, $a \neq 0$.


1. Introduction 

Below, we let $I \subseteq \mathbb{R}$ be a non-empty interval, which may be bounded or unbounded, and
need be neither open nor closed. We will need the following proposition, which is proved in Section 3
(see Section 2 for a glossary of notation and terms used but not defined in this introduction):

Proposition 1. Let $(S, \mathscr{C}, \gamma)$ be a probability space, and assume $w : I \to \mathbb{R}$ and $h : S \to I$ are
functions such that $w[I]$ is an interval and $w \circ h$ is $\gamma$-integrable. Then $\int_S w \circ h \, d\gamma \in w[I]$.

Given $(S, \mathscr{C}, \gamma)$ and $w$ as in Proposition 1, we denote by $\mathcal{L}^w(\gamma)$ the set of all $\mathscr{C}$-measurable
functions $h : S \to I$ such that $w \circ h$ is $\gamma$-integrable, while we write $\mathcal{H}(\gamma)$ for the set of all
$\mathscr{C}$-measurable functions $h : S \to I$ for which $h[S] \Subset I$.

Based on these premises, assume $w$ is an injection, so that we can consider the inverse, $w^{-1}$,
of $w$. It follows from Proposition 1 that the functional

$$\mathcal{L}^w(\gamma) \to \mathbb{R} : h \mapsto w^{-1}\left(\int_S w \circ h \, d\gamma\right), \tag{1}$$

which we denote by $\mathfrak{F}_\gamma(w)$ and refer to as the $w$-mean relative to $\gamma$, is well-defined and its image
is contained in $I$. For $h \in \mathcal{L}^w(\gamma)$ we call $\mathfrak{F}_\gamma(w)(h)$ the $w$-mean of $h$ relative to $\gamma$.

The naming comes from the observation that, if $I$ is the interval $]0, \infty[$ and $w$ is, for some real
$p \neq 0$, the function $I \to \mathbb{R} : x \mapsto x^p$, then $\mathcal{L}^w(\gamma)$ is the set of all ($\mathscr{C}$-measurable and positive)
functions $S \to I$ whose $p$-th power is $\gamma$-integrable, while $\mathfrak{F}_\gamma(w)$ is the integral mean

$$\mathcal{L}^w(\gamma) \to \mathbb{R} : h \mapsto \left(\int_S h^p \, d\gamma\right)^{1/p}.$$

When $S$ is a finite set, (1) gives a generalization of classical and weighted means (say, the
arithmetic mean, the quadratic mean, the harmonic mean, and others) first considered, respectively,
by A. Kolmogorov and M. Nagumo [15, 22] and B. de Finetti and T. Kitagawa [10, 14].
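
For a finite set $S$, the $w$-mean in (1) reduces to a weighted sum followed by an application of $w^{-1}$. The following is a minimal Python sketch of that special case; the helper name `w_mean`, the sample values, and the weights are illustrative assumptions and do not come from the paper, and the geometric mean (taking $w = \log$ on $]0, \infty[$) is included only as one further standard instance.

```python
import math

def w_mean(w, w_inv, values, weights):
    """Return w_inv(sum_i weights[i] * w(values[i])), i.e. (1) on a finite set S."""
    if any(p < 0 for p in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be non-negative and sum to 1")
    return w_inv(sum(p * w(x) for p, x in zip(weights, values)))

# Hypothetical data: h takes the values 1 and 4, each with probability 1/2.
values, weights = [1.0, 4.0], [0.5, 0.5]

arithmetic = w_mean(lambda x: x, lambda y: y, values, weights)            # w(x) = x
quadratic = w_mean(lambda x: x ** 2, math.sqrt, values, weights)          # w(x) = x^2
harmonic = w_mean(lambda x: 1.0 / x, lambda y: 1.0 / y, values, weights)  # w(x) = 1/x
geometric = w_mean(math.log, math.exp, values, weights)                   # w(x) = log x
```

Passing $w$ and $w^{-1}$ as separate callables keeps the sketch free of numerical inversion; it is the caller's responsibility to supply a genuine inverse pair.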

Indeed, our interest in Proposition 1 is mainly due to the following result, which also will be 
proved in Section 3. 

Proposition 2. Let $(U, \mathscr{A}, \alpha)$ be a measure space and $(V, \mathscr{B}, \beta)$ a probability space, and let $w$
be a continuous injection $I \to \mathbb{R}$ and $h$ a function $U \times V \to I$. The following hold:

(i) Let $w \circ h_x$ be $\beta$-integrable for every $x \in U$, where $h_x$ is the map $V \to \mathbb{R} : y \mapsto h(x, y)$.
Then the function $\varphi : U \to \mathbb{R} : x \mapsto \mathfrak{F}_\beta(w)(h_x)$ is well-defined and $\varphi[U] \subseteq I$. Moreover,
if $h$ is $(\mathscr{A} \otimes \mathscr{B})$-measurable and $w \circ h$ is bounded, then $\varphi$ is $\mathscr{A}$-measurable.

(ii) Suppose that $h[U \times V] \Subset I$, and let $h$ be $(\mathscr{A} \otimes \mathscr{B})$-measurable. Then $\varphi[U] \Subset I$, and $\varphi$ is
$\mathscr{A}$-measurable and bounded.

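Suppose now that $(U, \mathscr{A}, \alpha)$ and $(V, \mathscr{B}, \beta)$ are both probability spaces, and let $u$ and $v$ be
continuous injections $I \to \mathbb{R}$. By [3, Theorem 5.6.5], $u^{-1}$ and $v^{-1}$ are continuous functions too,
and we get by [3, Theorem 5.3.10] and Propositions 1 and 2.(ii) that the functional

$$\mathcal{H}(\alpha \otimes \beta) \to \mathbb{R} : h \mapsto u^{-1}\left(\int_U u\left(v^{-1}\left(\int_V v \circ h \, d\beta\right)\right) d\alpha\right),$$

which we denote by $\mathfrak{F}_{\alpha,\beta}(u, v)$ and refer to as the $(u, v)$-mean relative to $\alpha \otimes \beta$, is well-defined.
Then, for $h \in \mathcal{H}(\alpha \otimes \beta)$ we call $\mathfrak{F}_{\alpha,\beta}(u, v)(h)$ the $(u, v)$-mean of $h$ relative to $\alpha \otimes \beta$.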

Remark 1. Let $h \in \mathcal{H}(\alpha \otimes \beta)$, and denote by $h_x$, for each $x \in U$, the mapping $V \to \mathbb{R} : y \mapsto
h(x, y)$. Then, by Proposition 2.(i), the function $k : U \to \mathbb{R} : x \mapsto \mathfrak{F}_\beta(v)(h_x)$ belongs to $\mathcal{L}^u(\alpha)$,
and it is not difficult to verify that $\mathfrak{F}_{\alpha,\beta}(u, v)(h) = \mathfrak{F}_\alpha(u)(k)$. This shows a connection between
$\mathfrak{F}_{\alpha,\beta}(u, v)$ and $\mathfrak{F}_\alpha(u)$ and explains, we hope, the terminology.
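
The nesting described in Remark 1 can be sketched for finite $U$ and $V$, where $h$ becomes a matrix whose rows are the sections $h_x$. This is only an illustration under that finiteness assumption; the names `w_mean` and `uv_mean` and the numbers used are hypothetical and not taken from the paper.

```python
import math

def w_mean(w, w_inv, values, weights):
    """w-mean of finitely many values with respect to probability weights."""
    return w_inv(sum(p * w(x) for p, x in zip(weights, values)))

def uv_mean(u, u_inv, v, v_inv, h, alpha, beta):
    """(u, v)-mean of the matrix h: the u-mean (over rows, weights alpha)
    of the v-means of the rows (weights beta)."""
    # k[i] plays the role of k(x) in Remark 1: the v-mean of the section h_x.
    k = [w_mean(v, v_inv, row, beta) for row in h]
    return w_mean(u, u_inv, k, alpha)

# Hypothetical example with u the identity and v the square root on ]0, oo[.
h = [[1.0, 4.0], [9.0, 16.0]]
alpha = [0.25, 0.75]
beta = [0.5, 0.5]
result = uv_mean(lambda x: x, lambda y: y, math.sqrt, lambda y: y * y, h, alpha, beta)
```

Here the two row means are $1.5^2 = 2.25$ and $3.5^2 = 12.25$, so the outer arithmetic mean is $0.25 \cdot 2.25 + 0.75 \cdot 12.25 = 9.75$.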

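With the above in mind, assume for the rest of the section that $(X, \mathscr{L}, \lambda)$ and $(Y, \mathscr{M}, \mu)$ are
probability spaces, and let $f$ and $g$ be continuous injections $I \to \mathbb{R}$. We say that the pair $(f, g)$
is a $(\lambda, \mu)$-switch if for all $h \in \mathcal{H}(\lambda \otimes \mu)$ it holds:

$$\mathfrak{F}_{\lambda,\mu}(f, g)(h) = \mathfrak{F}_{\mu,\lambda}(g, f)(h^{\mathrm{op}}), \tag{2}$$

where $h^{\mathrm{op}}$ is the function $Y \times X \to I : (y, x) \mapsto h(x, y)$, or more explicitly (it is just a question
of unpacking the relevant definitions):

$$f^{-1}\left(\int_X f\left(g^{-1}\left(\int_Y g \circ h \, d\mu\right)\right) d\lambda\right) = g^{-1}\left(\int_Y g\left(f^{-1}\left(\int_X f \circ h \, d\lambda\right)\right) d\mu\right); \tag{3}$$

note that $h \in \mathcal{H}(\lambda \otimes \mu)$ if and only if $h^{\mathrm{op}} \in \mathcal{H}(\mu \otimes \lambda)$, as is necessary for (2) to make sense.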

It seems worth observing that if $f$ and $g$ are both equal to the identity function $x \mapsto x$
then (3) boils down to an instance of Fubini's theorem, and the same is true, more generally,
whenever $f$ and $g$ are affine functions (see Lemma 1).

In particular, $(f, g)$ is called a discrete $(\lambda, \mu)$-switch if it is a $(\lambda, \mu)$-switch, $X$ and $Y$ are finite
sets, and $\mathscr{L}$ and $\mathscr{M}$ are the powersets of $X$ and $Y$ (i.e., discrete sigma-algebras), respectively.

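It is straightforward from (3) and the definitions (we omit further details) that, if $X$ and $Y$
are non-empty finite sets, and $(x_i)_{1 \le i \le m}$ is a numbering of $X$ and $(y_j)_{1 \le j \le n}$ a numbering of $Y$,
then $(f, g)$ is a discrete $(\lambda, \mu)$-switch if and only if

$$f^{-1}\left(\sum_{i=1}^m \lambda_i f\left(g^{-1}\left(\sum_{j=1}^n \mu_j g(\xi_{i,j})\right)\right)\right) = g^{-1}\left(\sum_{j=1}^n \mu_j g\left(f^{-1}\left(\sum_{i=1}^m \lambda_i f(\xi_{i,j})\right)\right)\right) \tag{4}$$

for every $m$-by-$n$ matrix $(\xi_{i,j})_{1 \le i \le m,\, 1 \le j \le n}$ with entries in $I$, where for every $i = 1, \ldots, m$ and
$j = 1, \ldots, n$ we set $\lambda_i := \lambda(\{x_i\})$ and $\mu_j := \mu(\{y_j\})$; if $m, n \ge 2$, we may also say that $(f, g)$ is
a $(\lambda_1, \ldots, \lambda_{m-1}; \mu_1, \ldots, \mu_{n-1})$-weighted switch.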
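
The discrete switch condition can be probed numerically by evaluating both iterated means for a given matrix. The sketch below assumes finite weight vectors `lam` and `mu` and a matrix `xi` with positive entries; all names and numbers are illustrative choices, not taken from the paper. It compares a pair with $f = 3g + 1$ (where the two sides should agree up to rounding) against the pair $f = \mathrm{id}$, $g = \log$, which is not affinely related.

```python
import math

def switch_sides(f, f_inv, g, g_inv, xi, lam, mu):
    """Return (left, right): the two iterated means of the matrix xi,
    first averaging rows with g then with f, and vice versa for columns."""
    m, n = len(lam), len(mu)
    left = f_inv(sum(
        lam[i] * f(g_inv(sum(mu[j] * g(xi[i][j]) for j in range(n))))
        for i in range(m)))
    right = g_inv(sum(
        mu[j] * g(f_inv(sum(lam[i] * f(xi[i][j]) for i in range(m))))
        for j in range(n)))
    return left, right

# Hypothetical 2-by-3 matrix and probability weights.
xi = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
lam = [0.3, 0.7]
mu = [0.2, 0.5, 0.3]

g, g_inv = math.log, math.exp

# Affinely related pair: f = 3g + 1.
f_aff = lambda x: 3.0 * math.log(x) + 1.0
f_aff_inv = lambda y: math.exp((y - 1.0) / 3.0)
aff_left, aff_right = switch_sides(f_aff, f_aff_inv, g, g_inv, xi, lam, mu)

# Pair that is not affinely related: f = identity, g = log.
id_left, id_right = switch_sides(lambda x: x, lambda y: y, g, g_inv, xi, lam, mu)
```

A single matrix on which the two sides differ shows that a pair is not a switch for those weights; agreement on one matrix, of course, proves nothing by itself.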

In the present work, (3) and (4) are essentially regarded as functional equations in the
unknowns $f$ and $g$, and the main question we address can be loosely phrased as: Is there any
nice characterization of $(\lambda, \mu)$-switches? An answer is given by the following result, which is the
main contribution of the paper and will be proved in Section 4:

Main Theorem. Assume that $\lambda$ and $\mu$ are non-degenerate probability measures. Then $(f, g)$
is a $(\lambda, \mu)$-switch if and only if $f = ag + b$ for some $a, b \in \mathbb{R}$, $a \neq 0$.

The investigation of functional equations involving generalized means dates back at least to 
the work of G. Aumann [2] on the so-called “balancing property”, and it has been the subject of 
intense research for about eighty years, see, e.g., [11, Chapter III], [1, Chapter 17], [13], [17, 18], 
[19], [23], and references therein. 

On the other hand, a “practical motivation” for being interested in equation (2) comes from 
the study of certainty equivalences, a notion first introduced by S. H. Chew [6] in relation to
the theory of expected utility and decision making under uncertainty; the reader may refer to 
[9] and [25] for current trends in the area and a survey of the literature on the topic, and to [1, 
Section 7.3 and Chapters 15, 17, and 20] for further reading. 

On top of that, the study of the functional equation (2) fits in the mathematical literature on 
permutable mappings. The research on the topic essentially started in the 1920s, with G. Julia’s 
mémoire [12] and J. F. Ritt’s subsequent work on permutable rational functions [24].

The field is still active, particularly due, on the one hand, to a number of open problems and 
important conjectures in fixed point theory, and on the other to various intersections with the 
study of dynamical systems, see, e.g., [4], [7], [21], and references therein. 

2. Notation and conventions 

Throughout the paper, the letters $i$, $j$, $m$ and $n$ stand for positive integers (unless otherwise
noted), and $\mathbb{R}$ is the set of real numbers (endowed with its usual structure of ordered field).

We refer to [3] and [5], respectively, for basic aspects of real analysis and measure theory
(including notation and terms not defined here). Notably, integration shall always be understood
in the sense of Lebesgue, measures will take only non-negative real values, and the only topology
considered on [subsets of] $\mathbb{R}$ will be the [relative topology induced by the] usual topology.

If $f : X \to Y$ is a function and $S$ is a set, we denote by $f[S]$ the (direct) image of $S$ under
$f$, namely the set $\{f(x) : x \in S\} \subseteq Y$, and by $f^{-1}[S]$ the inverse image of $S$ under $f$, namely
the set $\{x \in X : f(x) \in S\} \subseteq X$. If, in addition, $f$ is injective, then we use $f^{-1}$ for the inverse
of the function $X \to f[X] : x \mapsto f(x)$, viz. the corestriction of $f$ to $f[X]$, and by an abuse of
language we refer to $f^{-1}$ as the inverse of $f$.

Given sigma-algebras $\mathscr{A}$ and $\mathscr{B}$, a function $w : U \to V$ is $(\mathscr{A}, \mathscr{B})$-measurable if $U \in \mathscr{A}$,
$V \in \mathscr{B}$, and for every $B \in \mathscr{B}$ there exists $A \in \mathscr{A}$ such that $w^{-1}[B \cap V] = A \cap U$; in particular,
$w$ is called $\mathscr{A}$-measurable if it is $(\mathscr{A}, \mathscr{B})$-measurable with $\mathscr{B}$ the Borel algebra on $\mathbb{R}$.

For $a, b \in \mathbb{R} \cup \{-\infty, \infty\}$ we write $[a, b]$ for the closed interval $\{x \in \mathbb{R} : a \le x \le b\}$, $[a, b[$ for
the semi-open interval $[a, b] \setminus \{b\}$, and $]a, b[$ for the open interval $[a, b] \setminus \{a, b\}$.

We say that a probability measure $\gamma : \mathscr{C} \to \mathbb{R}$ on a set $S$ is non-degenerate if $\gamma[\mathscr{C}] \neq \{0, 1\}$,
i.e., if some set in $\mathscr{C}$ has measure strictly between 0 and 1, in which case we refer to the triplet
$(S, \mathscr{C}, \gamma)$ as a non-degenerate probability space.

For $X, Y \subseteq \mathbb{R}$, we write $X \Subset Y$ to mean that $X$ is contained in a compact subset of $Y$.

3. Proof of Propositions 1 and 2 

First a remark. If $I = [a, \infty[$ for some positive $a \in \mathbb{R}$ and $w$ is the function $I \to \mathbb{R} : x \mapsto a$,
then $\int_S w \circ h \, d\gamma = a\gamma(S)$ for every $\mathscr{C}$-measurable function $h : S \to I$, but $a\gamma(S) \in I$ if and
only if $\gamma(S) = 1$. This is the reason for having $S$, and not an arbitrary $E \in \mathscr{C}$, as a domain of
integration in the statement of Proposition 1.

Proof of Proposition 1. Set $m := \inf w[I]$, $M := \sup w[I]$, and $J := \int_S w \circ h \, d\gamma$ (the integral
exists by the assumption that $w \circ h$ is $\gamma$-integrable). It follows from the monotonicity of the
Lebesgue integral and the fact that $\gamma$ is a probability measure that $m \le J \le M$.

Consequently, the only cases that we have to consider are when (i) $m \neq -\infty$ and $m \notin w[I]$,
or (ii) $M \neq \infty$ and $M \notin w[I]$ (otherwise the claim is trivial, since $-\infty < J < \infty$), and both of
these cases can be analyzed by essentially the same type of reasoning. Therefore, we restrict
our attention to the latter and prove that $J < M$: This will lead to the desired conclusion, since
$]m, M[ \subseteq w[I] \subseteq [m, M[$ by the hypothesis that $w[I]$ is an interval.

To start with, let us define, for each $n$, the sets $I_n := \{y \in I : w(y) \le M - 1/n\} \subseteq I$ and
$S_n := h^{-1}[I_n] \subseteq S$. Since $M \notin w[I]$, we have that $(I_n)_{n=1}^\infty$ is a countable covering of $I$, which in
turn implies that $(S_n)_{n=1}^\infty$ is a countable covering of $S$.

On the other hand, $S_n \subseteq S_{n+1}$ for every $n$, since clearly $I_n \subseteq I_{n+1}$. Therefore, we get from
[5, Proposition 1.5.12] that there must exist an integer $\nu \ge 1$ such that $\gamma(S_n) > 0$ for all $n \ge \nu$,
as otherwise we would obtain $1 = \gamma(S) = \lim_{n \to \infty} \gamma(S_n) = 0$, i.e. a contradiction.

Based on the above, let us define $M_\nu := \sup_{x \in S_\nu} w \circ h(x)$. By construction, we have $M_\nu \le
M - 1/\nu < M$, so the basic properties of integrals entail that

$$J = \int_S w \circ h \, d\gamma = \int_{S_\nu} w \circ h \, d\gamma + \int_{S \setminus S_\nu} w \circ h \, d\gamma \le M_\nu \gamma(S_\nu) + M \gamma(S \setminus S_\nu) < M,$$

which suffices to complete the proof.





Proof of Proposition 2. (i) Since $w$ is a continuous function, we have by [3, Theorem 5.3.10] that
$w[I]$ is an interval. Hence, Proposition 1 yields that the image of the function

$$\psi : U \to \mathbb{R} : x \mapsto \int_V w \circ h_x \, d\beta$$

is contained in $w[I]$, which proves that $\varphi$ is well-defined, and hence $\varphi[U] \subseteq I$. Thus, assume for
the remainder that $h$ is $(\mathscr{A} \otimes \mathscr{B})$-measurable and $w \circ h$ is bounded.

Then, by [5, Corollary 3.3.3], $\psi$ is an $\mathscr{A}$-measurable function, and since $w^{-1}$ is continuous this
is enough to conclude, in view of [5, Theorem 2.1.5(i)], that also $\varphi = w^{-1} \circ \psi$ is $\mathscr{A}$-measurable.

(ii) By point (i) and our assumptions, we have that $\varphi[U] \subseteq I$, $\varphi$ is $\mathscr{A}$-measurable, and there
is a compact set $K \subseteq \mathbb{R}$ such that $h[U \times V] \subseteq K \subseteq I$. We are left to show that $\varphi$ is bounded.

To this end, we note that the continuity of $w$, together with the above considerations, yields
that $m \le w(h(x, y)) \le M$ for all $(x, y) \in U \times V$, where $m$ and $M$ are, respectively, the minimum
and maximum of $w$ over $K$, which exist by Weierstrass’ (extreme value) theorem. Thus, we find
from basic properties of integrals (cf. the proof of Proposition 1) that

$$m \le \int_V w \circ h_x \, d\beta \le M \quad \text{for all } x \in U.$$

On the other hand, $w^{-1}$ is a continuous function $w[I] \to I$ and $J := [m, M] \Subset w[I]$. So we get by
[3, Theorem 5.3.10] and another application of Weierstrass’ theorem that $\varphi[U] \subseteq w^{-1}[J] \Subset I$. ■


4. Proof of the Main Theorem 


We split the proof into a series of three lemmas (recall that we are assuming $f$ and $g$ are
continuous injections $I \to \mathbb{R}$).

We begin with the “if” part of the theorem, for which we first need the following elementary
proposition. Throughout, $t_{\alpha,\beta}$ denotes, for $\alpha, \beta \in \mathbb{R}$, the affine function $\mathbb{R} \to \mathbb{R} : x \mapsto \alpha x + \beta$.

Proposition 3. Let $(S, \mathscr{C}, \gamma)$ be a probability space and $w : I \to \mathbb{R}$ a [continuous] injection,
and fix $\alpha, \beta \in \mathbb{R}$, $\alpha \neq 0$. Then, $t_{\alpha,\beta} \circ w$ is a [continuous] injection, and $\mathfrak{F}_\gamma(t_{\alpha,\beta} \circ w) = \mathfrak{F}_\gamma(w)$.


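Proof. Clearly, the function $t_{\alpha,\beta} \circ w$ is [continuous and] injective, and its inverse is the function
$w^{-1} \circ t_{\alpha^{-1}, -\alpha^{-1}\beta}$. So we get from basic properties of integrals that, for all $h \in \mathcal{L}^w(\gamma)$,

$$\mathfrak{F}_\gamma(t_{\alpha,\beta} \circ w)(h) = w^{-1}\left(\alpha^{-1} \int_S (\alpha \, w \circ h + \beta) \, d\gamma - \alpha^{-1}\beta\right) = w^{-1}\left(\int_S w \circ h \, d\gamma + \alpha^{-1}\beta \int_S d\gamma - \alpha^{-1}\beta\right),$$

which, together with $\int_S d\gamma = \gamma(S) = 1$, implies $\mathfrak{F}_\gamma(t_{\alpha,\beta} \circ w) = \mathfrak{F}_\gamma(w)$.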
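
The invariance stated in Proposition 3 is easy to observe on a finite probability space. The sketch below is illustrative only: the helpers `w_mean` and `affine_transform`, the choice $w = \log$, and the numerical data are assumptions made for the example.

```python
import math

def w_mean(w, w_inv, values, weights):
    """w-mean of finitely many values with respect to probability weights."""
    return w_inv(sum(p * w(x) for p, x in zip(weights, values)))

def affine_transform(w, w_inv, a, b):
    """Return t∘w and its inverse, where t(y) = a*y + b and a != 0.
    The inverse is y -> w_inv((y - b) / a), matching the formula in the proof."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return (lambda x: a * w(x) + b), (lambda y: w_inv((y - b) / a))

# Hypothetical data on a three-point probability space.
values = [0.5, 2.0, 7.0]
weights = [0.2, 0.3, 0.5]

base = w_mean(math.log, math.exp, values, weights)
tw, tw_inv = affine_transform(math.log, math.exp, -4.0, 10.0)
shifted = w_mean(tw, tw_inv, values, weights)
```

Up to floating-point rounding, `base` and `shifted` coincide, because the weights sum to 1 and the additive constant therefore cancels exactly as in the displayed computation.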


With this in hand, we can prove the following result, which is straightforward by Remark 1, 
Proposition 3, and Fubini’s theorem, viz. [5, Theorem 3.4.4] (we omit further details): 


Lemma 1. For the pair $(f, g)$ to be a $(\lambda, \mu)$-switch it is sufficient that there exist $a, b \in \mathbb{R}$,
$a \neq 0$ such that $f = ag + b$, and necessary and sufficient that $(af + b, cg + d)$ is a $(\lambda, \mu)$-switch
for all $a, b, c, d \in \mathbb{R}$ with $ac \neq 0$.

Now we show how to reduce Equation (3) to the case where the probability spaces under
consideration are discrete (i.e., the sigma-algebras of these spaces are discrete).

Lemma 2. Let $(f, g)$ be a $(\lambda, \mu)$-switch, and suppose that $\lambda$ and $\mu$ are non-degenerate
probability measures. Then $(f, g)$ is a discrete $(\lambda', \mu')$-switch for some non-degenerate probability
measures $\lambda'$ and $\mu'$ on the set $\{0, 1\} \subseteq \mathbb{R}$.

Proof. By hypothesis, there exist $A \in \mathscr{L}$ and $B \in \mathscr{M}$ such that $0 < \lambda(A) < 1$ and $0 < \mu(B) < 1$.
Let $A^c := X \setminus A$ and $B^c := Y \setminus B$ for notational convenience, and for a set $Z \subseteq X \times Y$ denote
by $1_Z$ the characteristic function $X \times Y \to \{0, 1\}$ that maps an element $z \in X \times Y$ to 1 if and
only if $z \in Z$. It is clear that all the simple functions $h : X \times Y \to I$ of the form

$$a 1_{A \times B} + b 1_{A^c \times B} + c 1_{A \times B^c} + d 1_{A^c \times B^c},$$

with $a, b, c, d \in I$, are $(\mathscr{L} \otimes \mathscr{M})$-measurable and such that $h[X \times Y]$ is contained in a compact
subset of $I$ (in fact, $h[X \times Y]$ is a finite set). Hence, taking $\mathcal{P}$ equal to the powerset of a 2-element
set $\{0, 1\}$, it is seen that (4) holds with respect to the probability measures $\lambda', \mu' : \mathcal{P} \to [0, \infty]$
defined by $\lambda'(\{0\}) := \lambda(A)$ and $\mu'(\{0\}) := \mu(B)$. ■

Lastly, we solve Equation (4) in the case where m = n = 2, which is sufficient for the “only 
if” part of the theorem to be proved, after the reduction implied by Lemma 2. 

We will need the following result, which belongs to the folklore, but whose short proof we 
include here for completeness and the sake of exposition: 


Proposition 4. Let $V = (V, +, \cdot)$ be a vector space over the real or complex field and $D \subseteq V$ a
convex set in $V$. Next, fix $\kappa \in \, ]0, 1[$ and let $\Phi : D \to \mathbb{R}$ be a $\kappa$-affine function, that is

$$\forall q_1, q_2 \in D : \Phi(\kappa q_1 + (1 - \kappa) q_2) = \kappa \Phi(q_1) + (1 - \kappa) \Phi(q_2). \tag{5}$$

Then $\Phi$ is $\tfrac{1}{2}$-affine (or Jensen-affine), namely (5) holds with $\kappa = \tfrac{1}{2}$.


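Proof. Pick $q_1, q_2 \in D$. It is then easily verified that

$$\forall x, y \in D : \frac{x + y}{2} = \kappa \left(\kappa \, \frac{x + y}{2} + (1 - \kappa) x\right) + (1 - \kappa) \left(\kappa y + (1 - \kappa) \frac{x + y}{2}\right); \tag{6}$$

this is called the Daróczy-Páles identity, and, to the best of our knowledge, it has gone unnoticed,
in spite of the straightforwardness of its proof, until a special case of it was used to prove [8,
Lemma 1]. In particular, (6) implies, together with (5) and the convexity of $D$, that

$$\begin{aligned}
\Phi\left(\frac{q_1 + q_2}{2}\right) &= \kappa \, \Phi\left(\kappa \, \frac{q_1 + q_2}{2} + (1 - \kappa) q_1\right) + (1 - \kappa) \, \Phi\left(\kappa q_2 + (1 - \kappa) \frac{q_1 + q_2}{2}\right) \\
&= (\kappa^2 + (1 - \kappa)^2) \, \Phi\left(\frac{q_1 + q_2}{2}\right) + 2\kappa(1 - \kappa) \, \frac{\Phi(q_1) + \Phi(q_2)}{2},
\end{aligned}$$

which, by the arbitrariness of $q_1, q_2 \in D$, leads to the desired conclusion.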
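
Identity (6) is a purely algebraic statement about convex combinations, so it can be spot-checked numerically. The following sketch does this for real numbers $x, y$ and a few values of $\kappa$; the function name `daroczy_pales_rhs` and the test grid are illustrative assumptions, and a numerical check of this kind is of course no substitute for the one-line algebraic verification.

```python
def daroczy_pales_rhs(x, y, kappa):
    """Right-hand side of identity (6) for real x, y and kappa in ]0, 1[."""
    mid = (x + y) / 2.0
    return (kappa * (kappa * mid + (1.0 - kappa) * x)
            + (1.0 - kappa) * (kappa * y + (1.0 - kappa) * mid))

# Hypothetical grid of sample points and parameters.
checks = [(x, y, k)
          for x in (-3.0, 0.25, 8.0)
          for y in (-1.5, 2.0)
          for k in (0.1, 0.5, 0.9)]

# Largest deviation between the right-hand side and the midpoint (x + y) / 2.
max_error = max(abs(daroczy_pales_rhs(x, y, k) - (x + y) / 2.0) for x, y, k in checks)
```

Expanding the right-hand side gives $(\kappa^2 + (1 - \kappa)^2 + 2\kappa(1 - \kappa)) \frac{x + y}{2} = \frac{x + y}{2}$, which is why `max_error` is zero up to rounding.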


Lemma 3. Fix $\alpha, \beta \in \, ]0, 1[$, and let $f$ and $g$ be such that, for all $x, y, z, w \in I$,

$$\begin{aligned}
& f^{-1}\big(\alpha f(g^{-1}(\beta g(x) + (1 - \beta) g(y))) + (1 - \alpha) f(g^{-1}(\beta g(z) + (1 - \beta) g(w)))\big) \\
&\quad = g^{-1}\big(\beta g(f^{-1}(\alpha f(x) + (1 - \alpha) f(z))) + (1 - \beta) g(f^{-1}(\alpha f(y) + (1 - \alpha) f(w)))\big),
\end{aligned} \tag{7}$$

i.e. $(f, g)$ is a $(\alpha; \beta)$-weighted switch. There then exist $a, b \in \mathbb{R}$, $a \neq 0$ such that $f = ag + b$.

Proof. Pick an arbitrary $\xi_0 \in I$. By (the second half of) Lemma 1, there is no loss of generality
in assuming, as we do, that $f(\xi_0) = g(\xi_0) = 0$. Since the claim is obvious if $I = \{\xi_0\}$, we also
suppose for the remainder of this proof that $I \setminus \{\xi_0\}$ is non-empty, and let $x_0 \in I \setminus \{\xi_0\}$.

By these assumptions and the injectivity of $f$ and $g$, neither of $f(x_0)$ or $g(x_0)$ is zero, and
we can “normalize” $f$ and $g$, respectively, to the functions

$$f_0 : I_0 \to \mathbb{R} : x \mapsto \frac{f(x)}{f(x_0)} \tag{8}$$

and

$$g_0 : I_0 \to \mathbb{R} : x \mapsto \frac{g(x)}{g(x_0)}, \tag{9}$$

where $I_0 := [x_0, \xi_0] \cup [\xi_0, x_0]$ (notice that we do not know whether $x_0 > \xi_0$ or $x_0 < \xi_0$).

Again by Lemma 1, the pair $(f_0, g_0)$ is still a $(\alpha; \beta)$-weighted switch, and we want to show
that this implies $f_0 = g_0$. To start with, note that, by construction and [3, Theorem 5.3.10],

$$f_0[I_0] = g_0[I_0] = [0, 1]. \tag{10}$$

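Pick $s, t, u, v \in [0, 1]$. Letting $x = g_0^{-1}(s)$, $y = g_0^{-1}(t)$, $z = g_0^{-1}(u)$, and $w = g_0^{-1}(v)$ in (7) gives

$$F(s, t, u, v) = G(s, t, u, v),$$

where for ease of notation we have put

$$F(s, t, u, v) := f_0^{-1}\big(\alpha f_0 \circ g_0^{-1}(\beta s + (1 - \beta) t) + (1 - \alpha) f_0 \circ g_0^{-1}(\beta u + (1 - \beta) v)\big)$$

and

$$\begin{aligned}
G(s, t, u, v) := g_0^{-1}\big(&\beta g_0 \circ f_0^{-1}(\alpha f_0 \circ g_0^{-1}(s) + (1 - \alpha) f_0 \circ g_0^{-1}(u)) \\
&+ (1 - \beta) g_0 \circ f_0^{-1}(\alpha f_0 \circ g_0^{-1}(t) + (1 - \alpha) f_0 \circ g_0^{-1}(v))\big).
\end{aligned}$$

Consequently, we have, for all $s, t, u, v \in [0, 1]$, that $g_0 \circ F(s, t, u, v) = g_0 \circ G(s, t, u, v)$, which,
setting $\varphi := g_0 \circ f_0^{-1}$ (so that $\varphi^{-1} = f_0 \circ g_0^{-1}$), is equivalent to saying that

$$\begin{aligned}
& \varphi\big(\alpha \varphi^{-1}(\beta s + (1 - \beta) t) + (1 - \alpha) \varphi^{-1}(\beta u + (1 - \beta) v)\big) \\
&\quad = \beta \varphi\big(\alpha \varphi^{-1}(s) + (1 - \alpha) \varphi^{-1}(u)\big) + (1 - \beta) \varphi\big(\alpha \varphi^{-1}(t) + (1 - \alpha) \varphi^{-1}(v)\big).
\end{aligned} \tag{11}$$

Notice that $\varphi$ is, by (10), a continuous bijection $[0, 1] \to [0, 1]$ for which $\varphi(0) = 0$ and $\varphi(1) = 1$
(here is the reason for having normalized $f$ to $f_0$ and $g$ to $g_0$ in the way we have done).

On the other hand, if we introduce the function

$$\Phi : [0, 1]^2 \to [0, 1] : (p, q) \mapsto \varphi\big(\alpha \varphi^{-1}(p) + (1 - \alpha) \varphi^{-1}(q)\big), \tag{12}$$

we can rewrite (11) in a more convenient form and find that

$$\forall (s, t), (u, v) \in [0, 1]^2 : \Phi(\beta (s, t) + (1 - \beta)(u, v)) = \beta \Phi(s, t) + (1 - \beta) \Phi(u, v).$$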

It follows from Proposition 4 that $\Phi$ is $\tfrac{1}{2}$-affine, and since $\Phi$ is a continuous function $[0, 1]^2 \to
[0, 1]$ (here is where we use that $\varphi$ is a continuous bijection $[0, 1] \to [0, 1]$), this in turn is enough
to conclude, by [16, Theorem 13.2.2], that there exist some $A, B, C \in \mathbb{R}$ such that

$$\forall u, v \in [0, 1] : \Phi(u, v) = Au + Bv + C.$$

In particular, $(A + B)u + C = \Phi(u, u) = u$ for all $u \in [0, 1]$ and $]0, 1[ \, \ni \varphi(\alpha) = \Phi(1, 0) = A + C$,
which yields $B = 1 - A$, $C = 0$, and $0 < A < 1$. So, by (12) and the bijectivity of $\varphi$, we see that

$$\forall u, v \in [0, 1] : \varphi(\alpha u + (1 - \alpha) v) = A \varphi(u) + (1 - A) \varphi(v).$$

To wit, $\varphi$ is an injective $(\alpha, A)$-affine function $[0, 1] \to [0, 1]$, which implies by [20, Theorem K]
that $\varphi$ is an injective affine function $[0, 1] \to [0, 1]$, viz. there exist $a, b \in \mathbb{R}$, $a \neq 0$ such that
$\varphi(u) = au + b$ for all $u \in [0, 1]$.

Taking into account that $\varphi = g_0 \circ f_0^{-1}$, $f_0(\xi_0) = g_0(\xi_0) = 0$, and $f_0(x_0) = g_0(x_0) = 1$, it then
follows that $g_0 = f_0$, as desired. But this, together with (8) and (9), shows that

$$f(x) = \frac{f(x_0)}{g(x_0)} \, g(x)$$

for every $x \in I_0$ and every $x_0 \in I \setminus \{\xi_0\}$. So the quotient $f(x)/g(x)$ is constant for $x \in I \setminus \{\xi_0\}$,
and since $f(\xi_0) = g(\xi_0) = 0$, we find that $f = ag$ for some $a \in \mathbb{R}$, $a \neq 0$. ■
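
The normalization used in the proof of Lemma 3 (subtract the value at $\xi_0$, then divide by the value at $x_0$) can be sketched as follows. This only illustrates the easy direction, namely that an affinely related pair normalizes to the same function; the helper `normalize`, the choice $g = \log$, and the points `xi0` and `x0` are hypothetical.

```python
import math

def normalize(w, xi0, x0):
    """Return x -> (w(x) - w(xi0)) / (w(x0) - w(xi0)), which sends xi0 to 0 and x0 to 1.
    Subtracting w(xi0) is the reduction permitted by Lemma 1."""
    base = w(xi0)
    scale = w(x0) - base
    if scale == 0:
        raise ValueError("w must take different values at xi0 and x0")
    return lambda x: (w(x) - base) / scale

# Hypothetical pair with f = -2g + 5 on the interval [1, 3].
g = math.log
f = lambda x: -2.0 * math.log(x) + 5.0
xi0, x0 = 1.0, 3.0

f0 = normalize(f, xi0, x0)
g0 = normalize(g, xi0, x0)

# Largest gap between the two normalized functions on a few sample points.
gap = max(abs(f0(x) - g0(x)) for x in (1.0, 1.5, 2.0, 2.5, 3.0))
```

The content of Lemma 3 is the converse: the weighted switch equation (7) alone forces $f_0 = g_0$, which a numerical sketch like this cannot establish.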


5. Closing remarks 

It seems interesting to try to solve Equation (4) in the presence of constraints on the entries
of the matrix $\Xi = (\xi_{i,j})_{1 \le i \le m,\, 1 \le j \le n}$ that appear on the left- and right-hand sides of the same
equation: E.g., we may require that the matrix $\Xi$ has rank $k$ (or $\le k$) for some positive integer
$k \le \min(m, n)$, or is square and symmetric (respectively, circulant, triangular, or whatsoever),
and for each of these cases we may ask whether or not it is still true that the pair $(f, g)$ is a
solution to (4) if and only if $f = ag + b$ for some $a, b \in \mathbb{R}$, $a \neq 0$.